DOCUMENT RESUME

ED 228 316                                        TM 830 242

AUTHOR       Blumberg, Carol Joyce; And Others
TITLE        Comparison of Methods of Data Analysis in
             Nonrandomized Experiments.
PUB DATE     Apr 83
NOTE         31p.; Paper presented at the Annual Meeting of the
             American Educational Research Association (67th,
             Montreal, Quebec, April 11-15, 1983). Research
             supported in part by a grant from the University of
             Delaware Research Foundation.
PUB TYPE     Speeches/Conference Papers (150) -- Reports -
             Research/Technical (143)
EDRS PRICE   MF01/PC02 Plus Postage.
DESCRIPTORS  Comparative Analysis; *Control Groups; *Data
             Analysis; *Data Collection; Evaluation Methods;
             Mathematical Models; *Research Design; Research
             Methodology; *Simulation
IDENTIFIERS  *Monte Carlo Studies; *Nonrandom Selection



ABSTRACT

Various methods have been suggested for the analysis
of data collected in research settings where random assignment of
subjects to groups has not occurred. For the purposes of this paper
the set of allowable nonrandomized designs is made up of those
research designs where data are collected for one or more groups of
subjects at two or more time points on some measure of interest.
Further, none of the groups need be a control group. The main purpose
of the paper is to describe and report the results of a Monte Carlo
simulation study that was carried out to determine which of several
data analysis methods developed by either Blumberg and Porter or
Olejnik yields the best point estimates of treatment effects under
various constraints. When growth on the measure of interest is linear
over time, Blumberg and Porter's methods provide the best estimates.
When growth is exponential over time, the results are mixed: under
some constraints Olejnik's method is best, but usually Blumberg and
Porter's methods provide the best estimates. (Author)






Comparison of Methods of Data Analysis
in Nonrandomized Experiments

Carol Joyce Blumberg
Sigurd L. Andersen, Jr.
Roberta E. G. Murphy
Linda D. Waters

University of Delaware 







Paper presented at the 1983 Annual Meeting of the American
Educational Research Association, Montreal, April 1983.

The research for this paper was supported in part by a grant from
the University of Delaware Research Foundation to Carol Joyce
Blumberg.

Abstract

Various methods have been suggested for the analysis of
data collected in research settings where random assignment
of subjects to groups has not occurred. For the purposes of
this paper the set of allowable nonrandomized designs is
made up of those research designs where data are collected
for one or more groups of subjects at two or more time points
on some measure of interest. Further, none of the groups
need be a control group. The main purpose of the paper is
to describe and report the results of a Monte Carlo simulation
study that was carried out to determine which of several
data analysis methods developed by either Blumberg and Porter
or Olejnik yields the best point estimates of treatment
effects under various constraints. When growth on the measure
of interest is linear over time, Blumberg and Porter's methods
provide the best estimates. When growth is exponential over
time, the results are mixed: under some constraints Olejnik's
method is best, but usually Blumberg and Porter's methods
provide the best estimates.



Various methods have been suggested for the analysis of
data collected in research settings where random assignment
of subjects to groups has not occurred. For the purposes of
this paper the set of allowable nonrandomized designs is made
up of those research designs where data are collected for one
or more groups of subjects at two or more time points on some
measure of interest. Further, none of the groups need be a
control group. The designs making up this set are most often
referred to as nonequivalent control group designs
and/or interrupted time series designs (Cook & Campbell, 1979).
The main purpose of this paper is to report the results of a
Monte Carlo simulation study that was carried out by the
authors to determine which of the several data analysis methods
to be described in the next section results in the best point
estimators of treatment effects under the various conditions
studied.

Data Analysis Methods

All of the data analysis methods to be compared in this
paper assume some type of continuous natural growth model
which is supposed to describe the changes (i.e., growth) in
the measure of interest over time. Blumberg (along with Porter)
has developed several methods for deriving point estimates
of treatment effects (Blumberg, 1982a; Blumberg, 1982b;
Blumberg & Porter, 1982). All of these methods assume the
following model of growth over time:

$$
\begin{aligned}
Y^*_{ij}(t) &= g_j(t) \cdot Y^*_{ij}(t_1) + h_j(t) + \alpha_j(t) \\
Y_{ij}(t) &= Y^*_{ij}(t) + e_{ij}(t)
\end{aligned}
\tag{1}
$$

where $Y^*_{ij}(t)$, $Y_{ij}(t)$, and $e_{ij}(t)$ represent the true scores,
observed scores, and errors of measurement, respectively,
for the ith individual in the jth group, on the measure of
interest; $g_j(t)$ and $h_j(t)$ are continuous functions;
$\alpha_j(t)$ represents the population treatment effect for the
jth group; and $t_1$ is an arbitrary time point.

Further assumptions are:

(1) Classical measurement theory holds. That is, for each time t,
$Y^*_{ij}(t)$ and $e_{ij}(t)$ are uncorrelated and $E(e_{ij}(t)) = 0$;

and (2) Treatment effects are additive.

The expression $g_j(t) \cdot Y^*_{ij}(t_1) + h_j(t)$ represents the natural
growth portion of the class of models and represents any natural
growth situation where there is a correlation within each group
between true scores at any two points in time. Finally, the
treatment effects, as defined by the $\alpha_j(t)$'s in the system of
equations (1), are not the same as the usual definition of
treatment effects. Let $\bar{\alpha}(t)$ be the grand mean of the
$\alpha_j(t)$'s. The usual definition of a treatment effect is given
by $\alpha_j(t) - \bar{\alpha}(t)$.

All that is required in order to apply Blumberg and Porter's
methods is that the functional forms of the $h_j(t)$'s are known
(e.g., $h_1(t)$ is a logarithmic function of the form
$h_1(t) = \log_b(c\,(t - t_1) + 1)$, where b and c are constants,
possibly unknown; $h_2(t)$ is a linear function of the form
$h_2(t) = c \cdot (t - t_1)$, where c is some constant, possibly
unknown; etc.). In this paper three of Blumberg and Porter's
methods will be described and used in the simulation study. The
reason for not discussing the remainder of their methods is that
the remaining methods are not applicable under the conditions
imposed for this particular simulation study.

Blumberg and Porter's first method requires that the data
analyst have knowledge of the functional forms of the $g_j(t)$'s
and $h_j(t)$'s. It further requires that pretest observations
under natural growth conditions are available on the measure
of interest at M or more time points, where M is the maximum
of (i) two more than the number of unknown constants in the
functional form expression for $g_j(t)$; and (ii) two more than
the number of unknown constants in the functional form
expression for $h_j(t)$. For convenience, this method will be called
Method A and p will denote the number of pretest time points.



If one considers the system of equations (1) for each of the p
pretest points, then the structural model depicted in Figure 1
can be set up relating the pretest observations at the various
time points. In this figure and in the remainder of the paper,
without loss of generality, the j subscript representing group
membership is dropped. This structural equations model contains
many unknown parameters, namely $g(t_2), g(t_3), \ldots, g(t_p)$,
$h(t_2), h(t_3), \ldots, h(t_p)$, the variance of the true scores at
time $t_1$, and the variances of the errors of measurement at
$t_1, t_2, \ldots, t_{p-1}$, and $t_p$.




Figure 1

Pictorial representation of the structural model



To implement Method A it is necessary to obtain maximum
likelihood estimates of these unknown parameters. But the
structural model is overidentified and hence does not have a
closed-form solution for the maximum likelihood estimates of the
parameters. Consequently, LISREL (Joreskog & Sorbom, 1978)
or some other maximum likelihood structural equations computer
program must be used to obtain the maximum likelihood estimates.
The Appendix gives the LISREL IV input stream corresponding
to Figure 1. Let $\hat{g}(t_k)$ represent the obtained maximum
likelihood estimate of $g(t_k)$ for $k = 2, 3, \ldots, p$. The maximum
likelihood estimates of the $h(t_k)$'s for $k = 2, 3, \ldots, p$ are
obtained by using $\hat{h}(t_k) = \bar{Y}(t_k) - \hat{g}(t_k) \cdot \bar{Y}(t_1)$,
where $\bar{Y}(t)$ denotes the mean of the observed scores at time t.
The $\hat{g}(t_k)$'s and the $\hat{h}(t_k)$'s thus provide estimates for
the true values of g(t) and h(t), respectively, at the pretest
time points. The method of least squares is then used to obtain
estimates of the unknown constants in the functional form
expressions for g(t) and h(t). For example, if
$g(t) = b \cdot c^{(t - t_1)} + (1 - b)$, then the estimates of b and c
are those values which minimize the quantity

$$\sum_{k=2}^{p} \left( \hat{g}(t_k) - \left( b \cdot c^{(t_k - t_1)} + (1 - b) \right) \right)^2 .$$

New functions, labelled $\hat{g}(t)$ and $\hat{h}(t)$, are formed by
substituting the estimates of the unknown constants, obtained
using the process just described, back into the functional form
expressions for g(t) and h(t). For example, if
$g(t) = b \cdot c^{(t - t_1)} + (1 - b)$ and $\hat{b} = 1.45$ and $\hat{c} = 2.7$,
then $\hat{g}(t) = 1.45 \cdot (2.7)^{(t - t_1)} - .45$.
Point estimates of treatment effects are finally given under
Method A by

$$\hat{\alpha}_A(t) = \bar{Y}(t) - \left( \hat{g}(t) \cdot \bar{Y}(t_1) + \hat{h}(t) \right).$$
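The least squares step of Method A for the exponential form of g(t) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' program: the function names are hypothetical, the values of ĝ(t_k) are taken as already produced by a structural equations program, the time points are assumed equally spaced one unit apart, and a simple grid search over c (with b solved in closed form for each c) stands in for whatever nonlinear least squares routine is actually used.

```python
def fit_exponential_g(g_hat, c_grid=None):
    """Fit g(t) = b*c**(t - t1) + (1 - b) by least squares.

    g_hat holds the estimates of g(t_k) for k = 2..p, assuming unit
    spacing so that t_k - t_1 = k - 1 (an assumption of this sketch).
    For a fixed c the model is linear in b, because
    g(t) - 1 = b * (c**(t - t1) - 1), so b has a closed form.
    """
    if c_grid is None:
        # Hypothetical search range for c; adjust to the problem at hand.
        c_grid = [1.01 + 0.001 * i for i in range(2000)]
    lags = range(1, len(g_hat) + 1)
    y = [g - 1.0 for g in g_hat]
    best = None
    for c in c_grid:
        x = [c ** d - 1.0 for d in lags]
        b = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
        sse = sum((yi - b * xi) ** 2 for xi, yi in zip(x, y))
        if best is None or sse < best[0]:
            best = (sse, b, c)
    return best[1], best[2]


def method_a_estimate(mean_t, mean_t1, g_t, h_t):
    """Method A point estimate: Ybar(t) - (g_hat(t)*Ybar(t1) + h_hat(t))."""
    return mean_t - (g_t * mean_t1 + h_t)
```

Solving for b in closed form at each trial value of c keeps the search one-dimensional; a general-purpose nonlinear least squares routine would serve equally well.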

Blumberg and Porter's second method, to be called Method B,
depends upon assuming that the reliability of Y, the measure
of interest, is constant over time and upon having knowledge
of the exact nature of h(t) [e.g., knowing that $h(t) = 3 \cdot t$ or
$h(t) = \log_7[4(t - 1)]$, etc.] and requires observations at only
one pretest time point, namely $t_1$. Under Method B, point
estimates of treatment effects are given by

$$\hat{\alpha}_B(t) = \bar{Y}(t) - \left[ \frac{S_Y(t)}{S_Y(t_1)} \cdot \bar{Y}(t_1) + h(t) \right],$$

where $S_Y(t)$ represents the standard deviation of Y(t).
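The Method B estimator can be sketched in a few lines. This is a minimal illustration, assuming the observed scores of one group at the pretest and at the time of interest are available as plain lists; the function name is hypothetical, and the sample standard deviation is used for S_Y (only the ratio of the two standard deviations matters).

```python
from statistics import mean, stdev


def method_b_estimate(scores_t, scores_t1, h_t):
    """Method B: Ybar(t) - [(S_Y(t)/S_Y(t1)) * Ybar(t1) + h(t)].

    scores_t  : observed scores at the time point of interest
    scores_t1 : observed scores at the single pretest time point t1
    h_t       : the known value of h(t) at the time point of interest
    """
    ratio = stdev(scores_t) / stdev(scores_t1)
    return mean(scores_t) - (ratio * mean(scores_t1) + h_t)
```

With error-free scores that follow Y(t) = 2·Y(t_1) + 3 and no treatment, the ratio of standard deviations recovers g(t) = 2 and the estimate is zero.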



Blumberg and Porter's third method, to be called Method C,
depends upon assuming that the reliability of Y is constant
over time and upon having knowledge of the functional forms of
g(t) and h(t). Further, g(t) and h(t) can each have only
one unknown constant (e.g., $g(t) = b \cdot (t - t_1) + 1$ and
$h(t) = \log_7[c \cdot (t - t_1)]$). Method C also requires that pretest
observations are available at two pretest time points, say $t_1$
and $t_2$. This method is a combination of some aspects of Methods
A and B. Under Method C, $g(t_2)$ can be estimated by
$S_Y(t_2)/S_Y(t_1)$. Call this estimator of $g(t_2)$ by the name
$\hat{g}(t_2)$. The value of $h(t_2)$ is then estimated by using
$\hat{h}(t_2) = \bar{Y}(t_2) - \hat{g}(t_2) \cdot \bar{Y}(t_1)$.
The equations $g(t_2) = \hat{g}(t_2)$ and $h(t_2) = \hat{h}(t_2)$ are then
solved for the unknown constants. These solutions provide
estimators of the unknown constants. For example, if
$g(t) = b \cdot (t - t_1) + 1$, then the equation
$S_Y(t_2)/S_Y(t_1) = b \cdot (t_2 - t_1) + 1$ is solved for b,
yielding $\hat{b}_C = [S_Y(t_2)/S_Y(t_1) - 1]/(t_2 - t_1)$. Once the
estimates of the unknown constants are obtained, new functions
labelled $\hat{g}(t)$ and $\hat{h}(t)$ are formed, as in Method A, by
substituting the estimates of the unknown constants into the
functional form expressions for g(t) and h(t). Point estimates
of treatment effects are then given by

$$\hat{\alpha}_C(t) = \bar{Y}(t) - \left[ \hat{g}(t) \cdot \bar{Y}(t_1) + \hat{h}(t) \right].$$
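The Method C steps can be traced in code for one simple pair of functional forms. The sketch below assumes g(t) = b·(t − t_1) + 1, as in the example in the text, together with h(t) = c·(t − t_1); the linear form for h(t) is an assumption chosen here for illustration, and the function name is hypothetical.

```python
from statistics import mean, stdev


def method_c_estimate(pre1, pre2, post, t1, t2, t):
    """Method C with g(t) = b*(t - t1) + 1 and h(t) = c*(t - t1).

    pre1, pre2 : observed scores at the two pretest time points t1, t2
    post       : observed scores at the time point t of interest
    """
    g2 = stdev(pre2) / stdev(pre1)          # estimate of g(t2)
    h2 = mean(pre2) - g2 * mean(pre1)       # estimate of h(t2)
    b = (g2 - 1.0) / (t2 - t1)              # solve g(t2) = g2 for b
    c = h2 / (t2 - t1)                      # solve h(t2) = h2 for c
    g_t = b * (t - t1) + 1.0                # fitted g at time t
    h_t = c * (t - t1)                      # fitted h at time t
    return mean(post) - (g_t * mean(pre1) + h_t)
```

For other functional forms only the two "solve" lines and the two "fitted" lines change; the estimates of g(t_2) and h(t_2) are obtained in the same way.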

Olejnik (1977) assumes the following model for the mean
population growth over time on the measure of interest:

$$\mu_{Y^*}(t) = [b \cdot (t - t_1) + 1] \cdot \mu_{Y^*}(t_1) + c \cdot (t - t_1) + \alpha(t)$$

and

$$Y_i(t) = Y^*_i(t) + e_i(t),$$

where $\mu_{Y^*}(t)$ is the population mean for $Y^*(t)$. Hence, Olejnik
requires that the population mean natural growth over time be
linear while Blumberg and Porter allow natural growth to follow
any continuous function. Olejnik, however, does not require
the assumption of a correlation of +1 between true scores at
any two points in time, as is required by Blumberg and Porter's
model of natural growth. Olejnik's method, as did Blumberg and
Porter's Method C, requires observations to be available at
exactly two pretest time points, namely $t_1$ and $t_2$. Under
Olejnik's method, the point estimators of treatment effects
are given by

$$\hat{\alpha}_O(t) = \bar{Y}(t) - \bar{Y}(t_1) - \left[ \bar{Y}(t_2) - \bar{Y}(t_1) \right] \cdot \frac{t - t_1}{t_2 - t_1}.$$
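Olejnik's estimator amounts to extrapolating the straight line through the two pretest means and subtracting it from the observed mean. A minimal sketch, with a hypothetical function name and the group means passed in directly:

```python
def olejnik_estimate(mean_t, mean_t1, mean_t2, t, t1=1.0, t2=2.0):
    """Olejnik's point estimate of the treatment effect at time t.

    mean_t, mean_t1, mean_t2 : observed group means at t, t1, and t2
    The line through the two pretest means is extended to time t and
    the difference from the observed mean is the estimate.
    """
    slope = (mean_t2 - mean_t1) / (t2 - t1)
    return mean_t - mean_t1 - slope * (t - t1)
```

For example, with pretest means of 5 and 6 at t = 1 and t = 2, the line predicts 8 at t = 4, so an observed mean of 8.5 gives an estimate of 0.5.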

All four of the methods just described have some unsolved
problems associated with them. The methods developed by
Blumberg and Porter are based on maximum likelihood estimation
and/or the use of ratios of standard deviations. Both maximum
likelihood techniques and estimators based on ratios of standard
deviations are known to often lead to biased, although
consistent, estimators. One unsolved problem is whether each of
Blumberg and Porter's methods leads to estimators whose bias is
at an acceptable or unacceptable level. Further, nothing is
known about the standard errors of the estimators generated by
these methods. Olejnik's method has only been studied when
the population natural growth pattern was taken to be linear
over time. It can easily be shown by elementary algebra and
statistics that when population mean growth is linear,
Olejnik's method produces unbiased estimates of treatment
effects. Olejnik (1977) studied the standard error of his
method for linear mean population growth under various
constraints on the errors of measurement. The bias and standard
error of Olejnik's method have not, however, been studied
for non-linear mean population growth. The computer simulation
study to be described presently thus had several purposes:

(i) to study the bias of Blumberg and Porter's Methods
A, B, and C and Olejnik's method under various natural growth
formulations;

(ii) to study the standard errors of the four methods;

(iii) to compare the estimates obtained under the four
methods;

and (iv) to make recommendations for the use of these methods
on real data sets.
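The unbiasedness of Olejnik's estimator under linear mean growth, asserted above, can be checked in a few lines. The sketch assumes (these assumptions are not spelled out at this point in the text) that no treatment acts at the two pretest points, so that α(t_1) = α(t_2) = 0, and that E[Ȳ(t)] = μ_{Y*}(t) because the errors of measurement have mean zero; μ(t) abbreviates μ_{Y*}(t).

```latex
\begin{align*}
\mu(t_2) - \mu(t_1)
  &= (t_2 - t_1)\,\bigl[\,b\,\mu(t_1) + c\,\bigr], \\
E\bigl[\hat{\alpha}_O(t)\bigr]
  &= \mu(t) - \mu(t_1)
     - \bigl[\mu(t_2) - \mu(t_1)\bigr]\,\frac{t - t_1}{t_2 - t_1} \\
  &= (t - t_1)\bigl[\,b\,\mu(t_1) + c\,\bigr] + \alpha(t)
     - (t - t_1)\bigl[\,b\,\mu(t_1) + c\,\bigr] \\
  &= \alpha(t).
\end{align*}
```

The cancellation in the third line depends on the natural growth term being linear in t, which is why the argument does not carry over to non-linear mean growth.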

There is only one other class of methods known to the authors
by which one can obtain point estimates of treatment effects.
This class of methods, which was developed by Strenio, Bryk, and
Weisberg (Bryk, Strenio, & Weisberg, 1980; Strenio, Weisberg, &
Bryk, in press), is based on the ideas of Empirical Bayes
estimation. The use of their class of methods demands a great deal
of mathematical and statistical sophistication on the part of
the data analyst. Hence, even though Strenio, Bryk, and Weisberg
have produced an excellent class of methods, their methods were
not included in this simulation study because of their
complexity.

Set Up of Simulation Study

One thousand two hundred data sets were generated in the
following manner. First, the canned program NRAN31 was used
to generate 13 standard normal random deviates for each of
25 individuals. This program and all remaining programs
mentioned in this paper were run on the Burroughs 7700 Computer
at the University of Delaware. A base true score for each
individual was established by adding 5 to the first standard
normal random deviate generated for each individual. Without
loss of generality, this time point was set equal to
t = 1. Two different sets of individuals' true scores under
natural growth over time were generated at 11 additional
time points, which were taken to be equally spaced at
$t = 2, 3, \ldots, 11$, and 12, using $Y^*(t) = g(t) \cdot Y^*(1) + h(t)$, where
g(t) and h(t) were certain specified functions. The first
set of true scores was generated by setting $g(t) = .5(t - 1) + 1$
and $h(t) = .3 \cdot (t - 1)$. The second set of true scores was
generated using $g(t) = .7 \cdot (1.2)^{t-1} + .3$ and $h(t) = 0$. Next,
it was assumed that the reliability of Y was constant across
time. Three different values were taken for this reliability:
.5, .7, and .9. For each of these reliability values the
second through thirteenth standard normal random deviates
generated at the first step were used to add on errors of
measurement to the true scores in order to generate observed scores
with the required reliability values. Thus, for each set of
25 individuals, six different data sets were generated. The
properties of the six data sets are enumerated below:

(1) The first data set, which will be referred to as .5 Linear,
was generated using $g(t) = .5(t - 1) + 1$, $h(t) = .3 \cdot t$, and a
reliability of .5.

(2) The second data set, which will be referred to as .7
Linear, was generated using $g(t) = .5(t - 1) + 1$, $h(t) = .3 \cdot t$,
and a reliability of .7.

(3) The third data set, which will be referred to as .9
Linear, was generated using $g(t) = .5(t - 1) + 1$, $h(t) = .3 \cdot t$,
and a reliability of .9.

(4) The fourth data set, which will be referred to as .5
Exponential, was generated using $g(t) = .7 \cdot (1.2)^{t-1} + .3$,
$h(t) = 0$, and a reliability of .5.

(5) The fifth data set, which will be referred to as .7
Exponential, was generated using $g(t) = .7 \cdot (1.2)^{t-1} + .3$,
$h(t) = 0$, and a reliability of .7.

(6) The sixth data set, which will be referred to as .9
Exponential, was generated using $g(t) = .7 \cdot (1.2)^{t-1} + .3$,
$h(t) = 0$, and a reliability of .9.
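The generation of one simulated data set can be sketched as below. This is an illustration under stated assumptions rather than a reconstruction of the original programs: Python's `random` module stands in for NRAN31, fresh deviates are drawn for every call instead of reusing the same thirteen deviates across the six data sets, h(t) for the linear case follows the form h(t) = .3(t − 1) given in the description of the generating functions, and the error standard deviation |g(t)|·sqrt((1 − r)/r) is one way (assumed here, not stated in the paper) to obtain reliability r when the base true scores have unit variance.

```python
import math
import random


def g_linear(t):
    return 0.5 * (t - 1) + 1.0


def h_linear(t):
    return 0.3 * (t - 1)


def g_exponential(t):
    return 0.7 * 1.2 ** (t - 1) + 0.3


def h_zero(t):
    return 0.0


def generate_data_set(g, h, reliability, n=25, n_times=12, rng=None):
    """Return (true_scores, observed_scores), each n rows by n_times columns.

    True scores follow Y*(t) = g(t) * Y*(1) + h(t) with Y*(1) = 5 + z.
    Errors are scaled so that var(true) / var(observed) = reliability,
    assuming var(Y*(1)) = 1 (the scaling rule is this sketch's assumption).
    """
    rng = rng or random.Random()
    err_scale = math.sqrt((1.0 - reliability) / reliability)
    times = range(1, n_times + 1)
    true_scores, observed_scores = [], []
    for _ in range(n):
        base = 5.0 + rng.gauss(0.0, 1.0)
        true_row = [g(t) * base + h(t) for t in times]
        obs_row = [y + abs(g(t)) * err_scale * rng.gauss(0.0, 1.0)
                   for t, y in zip(times, true_row)]
        true_scores.append(true_row)
        observed_scores.append(obs_row)
    return true_scores, observed_scores
```

Calling `generate_data_set(g_exponential, h_zero, 0.9)` would give one data set of the kind labelled .9 Exponential; repeating the call 200 times per condition mirrors the size of the study.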



The procedure just described in the preceding paragraph
was repeated 200 times yielding a total of 1200 simulated data
sets. Notice that when the data sets were generated no
treatment effects were entered into the data. Hence, when the four
methods described in the last section are used to estimate
treatment effects, the calculated values of the estimated
treatment effects do in fact represent the bias in the methods because
the theoretical values of all treatment effects were set to
zero.

For the .5 Linear, .7 Linear, and .9 Linear data sets the
estimates of treatment effects for the various methods were
calculated in the following manner. For Method A the time
points t = 1, 2, 3, 4, 5, and 6 were taken as the pretest time points.
The simulated observed scores for each data set corresponding
to these six time points were entered into the LISREL program
illustrated in the Appendix. The LISREL estimates of GA(1,1),
GA(2,1), GA(3,1), GA(4,1), and GA(5,1) were then used as the
maximum likelihood estimates of g(2), g(3), g(4), g(5), and
g(6), respectively. It was then assumed that $g(t) = b \cdot (t - 1) + 1$
and that $h(t) = c \cdot (t - 1)$. The method of least squares
was then used to estimate b and c. In this case, because both
g(t) and h(t) are linear, closed expressions for $\hat{b}$ and $\hat{c}$ are
available and are given by $\hat{b} = (Q - 15)/55$ and by

$$\hat{c} = \left( \bar{Y}(2) + 2\bar{Y}(3) + 3\bar{Y}(4) + 4\bar{Y}(5) + 5\bar{Y}(6) - Q \cdot \bar{Y}(1) \right)/55,$$

where $Q = \hat{g}(2) + 2\hat{g}(3) + 3\hat{g}(4) + 4\hat{g}(5) + 5\hat{g}(6)$. Finally,
$\hat{\alpha}_A(t)$ was calculated for t = 7, 8, 9, 10, 11, and 12 using

$$\hat{\alpha}_A(t) = \bar{Y}(t) - (\hat{b}(t - 1) + 1) \cdot \bar{Y}(1) - \hat{c} \cdot (t - 1).$$

For Method B it was assumed that $h(t) = .3 \cdot t$ (the correct
function) and t = 1 was taken as the only required pretest time
point. To calculate the $\hat{\alpha}_B(t)$'s the formula

$$\hat{\alpha}_B(t) = \bar{Y}(t) - \left( (S_Y(t)/S_Y(1)) \cdot \bar{Y}(1) + .3 \cdot t \right)$$

was used for t = 2, 3, 4, 5, 6, and 7. For Method C and for Olejnik's
method it was assumed that $g(t) = b \cdot (t - 1) + 1$ and $h(t) = c \cdot t$
and the pretest time points were taken as being t = 1 and t = 2.
When these linear functions are assumed for g(t) and h(t),
Method C and Olejnik's method result in the same estimates for
treatment effects. For ease of later discussion, these estimates
will be referred to as the estimates from Olejnik's method and
are given by

$$\hat{\alpha}_O(t) = \bar{Y}(t) - \bar{Y}(1) - (\bar{Y}(2) - \bar{Y}(1)) \cdot (t - 1)$$

for t = 3, 4, 5, 6, 7, and 8.
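The closed expressions for Method A in the linear case can be evaluated directly once the five estimates of g(2) through g(6) are in hand. In the sketch below the function names are hypothetical and the estimates of g are assumed to have been obtained elsewhere (for example, from a structural equations program); the constants 15 and 55 are the sums of 1 through 5 and of their squares.

```python
def method_a_linear_constants(g_hat, pretest_means):
    """Closed-form least squares estimates of b and c for Method A.

    g_hat         : [g_hat(2), ..., g_hat(6)], five values
    pretest_means : [Ybar(1), ..., Ybar(6)], six values
    Assumes g(t) = b*(t - 1) + 1 and h(t) = c*(t - 1).
    """
    weights = range(1, 6)
    q = sum(k * g for k, g in zip(weights, g_hat))
    b = (q - 15.0) / 55.0
    weighted_means = sum(k * m for k, m in zip(weights, pretest_means[1:]))
    c = (weighted_means - q * pretest_means[0]) / 55.0
    return b, c


def alpha_a_linear(mean_t, mean_1, b, c, t):
    """Method A estimate at posttest time t for the linear forms."""
    return mean_t - (b * (t - 1) + 1.0) * mean_1 - c * (t - 1)
```

Feeding in error-free values generated with b = .5 and c = .3 returns those constants and a zero estimated treatment effect, which is a convenient check on the formulas.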

For the .5 Exponential, .7 Exponential, and .9 Exponential
data sets the estimates of the treatment effects were calculated
in the following manner. For Method A the time points t = 1, 2,
3, 4, 5, and 6 were taken as the pretest time points and the $\hat{g}(t)$'s
for t = 2, 3, 4, 5, and 6 were generated using LISREL as described in
the previous paragraph. It was then assumed that
$g(t) = b \cdot c^{(t-1)} + (1 - b)$. Since h(t) was set to be identically
equal to zero when generating the data sets, h(t) was assumed to be
identically equal to zero for Method A and for all the other methods
when simulating the analysis methods for the Exponential data sets.
The method of least squares, using the ZXSSQ subroutine of the
IMSL package, was then employed to estimate b and c, and
$\hat{\alpha}_A(t)$ was calculated using

$$\hat{\alpha}_A(t) = \bar{Y}(t) - \left[ \hat{b} \cdot \hat{c}^{(t-1)} + (1 - \hat{b}) \right] \cdot \bar{Y}(1)$$

for t = 7, 8, 9, 10, 11, and 12. For Method B, t = 1 was used as the
pretest time point and $\hat{\alpha}_B(t)$ was calculated using the formula

$$\hat{\alpha}_B(t) = \bar{Y}(t) - \frac{S_Y(t)}{S_Y(1)} \cdot \bar{Y}(1)$$

for t = 2, 3, 4, 5, 6, and 7. Method C is not applicable for the
Exponential data sets since $g(t) = b \cdot c^{(t-1)} + (1 - b)$, which is
the corresponding functional form for the g(t) used to generate the
data sets, has two unknown constants. For Olejnik's method the time
points of the pretests were taken as t = 1 and t = 2 and the formula

$$\hat{\alpha}_O(t) = \bar{Y}(t) - \bar{Y}(1) - (\bar{Y}(2) - \bar{Y}(1)) \cdot (t - 1)$$

for t = 3, 4, 5, 6, 7, and 8 was still used to estimate the treatment
effects, even though it was realized that Olejnik's assumption of
population mean growth being linear does not hold for these data
sets.

Results and Conclusions

The easiest way to report the results of this simulation
study is by the use of tables. Tables 1 through 6 give the
results for the .5 Linear, .7 Linear, .9 Linear, .5 Exponential,
.7 Exponential, and .9 Exponential data sets, respectively.

Insert Tables 1 to 6 Here

As was mentioned earlier, the observed mean for each of the
estimators over the 200 simulated data sets is the same as the
observed bias of these estimates since the theoretical value of
the treatment effects is zero. This observed bias is reported in
each table in the column labelled Observed bias. The standard
deviation of each of the various estimated treatment effects over
the 200 data sets is an estimate of the standard error of the
estimators and is reported in the column of each table labelled
Standard deviation. In each table the column labelled Percentage
best reports the number of times that the indicated method
yielded an estimated treatment effect whose absolute value was
less than the absolute value of the estimated treatment effects
generated using the other two methods. Conversely, the column
labelled Percentage worst reports the number of times that the
indicated method yielded an estimated treatment effect whose
absolute value was more than the absolute value of the estimated
treatment effects generated using the other two methods. The
rows labelled A and B refer to Blumberg and Porter's methods
and the rows labelled O refer to Olejnik's method. The starred
values in Tables 4, 5, and 6 are crude estimates of the observed
bias and standard deviations rather than the actual values. To
keep the computer programming tractable, values of estimated
treatment effects which were smaller than -1000 were treated
as missing when the observed bias and standard deviations were
computed. Hence the observed biases are even more negative than
indicated and the standard errors are even bigger than indicated.
The reason for including the crude estimates of bias and standard
deviation is that they do give an indication of the problems
associated with Method A when exponential growth is used.
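The four summary columns described above can be computed from the per-data-set estimates as sketched below. The function name and the dictionary layout are hypothetical; ties in absolute value are resolved in favour of the method listed first, which is a simplification of the strict comparisons described in the text, and the handling of values below -1000 as missing is not reproduced.

```python
from statistics import mean, stdev


def summarize(estimates):
    """Summarize simulated treatment effect estimates by method.

    estimates : dict mapping a method label to a list of estimates,
                one per simulated data set (all lists the same length).
    Returns, per method, the observed bias (the mean, since the true
    effect is zero), the standard deviation, and the percentages of
    data sets on which the method had the smallest or largest
    absolute estimate.
    """
    methods = list(estimates)
    n = len(estimates[methods[0]])
    best_counts = {m: 0 for m in methods}
    worst_counts = {m: 0 for m in methods}
    for i in range(n):
        abs_vals = {m: abs(estimates[m][i]) for m in methods}
        best_counts[min(abs_vals, key=abs_vals.get)] += 1
        worst_counts[max(abs_vals, key=abs_vals.get)] += 1
    return {
        m: {
            "bias": mean(estimates[m]),
            "sd": stdev(estimates[m]),
            "pct_best": 100.0 * best_counts[m] / n,
            "pct_worst": 100.0 * worst_counts[m] / n,
        }
        for m in methods
    }
```

Only methods that produce an estimate at the same posttest time point should be passed in together, since the best and worst comparisons are made data set by data set.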

From inspection of Tables 1 through 6 several conclusions
can be drawn. When g(t) and h(t) are linear (Tables 1, 2, and
3), Method A leads to point estimates of treatment effects which
appear to have no noticeable bias while Method B leads to biased
estimates. Olejnik's method theoretically leads to unbiased
estimates and this was confirmed by the simulation study. Method
B has much larger standard errors than either Method A or Olejnik's
method. Further, Method B, for all reliability levels and for
all posttest time points, rarely gives estimates with smaller
absolute value (i.e., Percentage best is low) than either
Method A or Olejnik's method and, in fact, most often yields
the estimates with the largest absolute value (i.e., Percentage
worst is high). Hence, Method B can be eliminated as a possible
method for analyzing data which follow a linear growth pattern
over time. Therefore, the choice of data analysis methods for
linear growth is reduced to Method A and Olejnik's method.

Since both Method A and Olejnik's method lead to virtually
unbiased estimators, the choice between them must be made based
on considerations other than bias. When one extends one time
point beyond the last pretest, for all three reliability levels
the standard error for Olejnik's method is less than the standard
error for Method A, and further, Olejnik's method leads to smaller
absolute values of estimated treatment effects a larger
percentage of the time. When one extends two time points beyond the
pretests, for all three reliability levels the standard errors
and Percentages best and worst are approximately the same for
both methods. When one extends three or more time points beyond
the pretests, for all three reliability levels the standard errors
and Percentages worst are smaller and the Percentages best are
larger for Method A than for Olejnik's method. Hence, for
measures of interest whose true growth pattern over time is
linear, it appears that if one wants to extend only one time
point beyond the pretests, Olejnik's method should be used.
If one wants to extend two time points beyond, it appears to
be a toss-up. But Olejnik's method is much simpler to use
and hence is recommended when extending two time points beyond
the pretests. When one wants to extend three or more time points
beyond the pretests, Method A appears to be the preferable
method.

When the true growth pattern on the measure of interest
follows the exponential growth model (Tables 4, 5, and 6),
Method A can immediately be eliminated as a possible data
analysis method because of its huge standard errors. Hence, when
data follow an exponential growth model the choice of data
analysis methods is limited to Method B and Olejnik's method.
For all three reliability levels, and when one extends any number
of time points beyond the pretest time points, Method B appears
to give estimates of treatment effects with no noticeable bias
while Olejnik's method always leads to biased estimators. The
bias of Olejnik's method should not, however, be surprising
since the method assumes linear growth and the growth model
used to generate the data was not linear. When one extends
only one or two time points beyond the pretest time points,
for all three reliability levels Olejnik's method has smaller
standard errors and Percentages worst and larger Percentages
best than does Method B. Hence, despite being slightly biased,
Olejnik's method appears to be the preferable method when
extending only one or two time points beyond the pretests.

When extending three time points beyond the pretest time
points the choice of methods is dependent on the reliability
of the data. When the reliability is .9, Method B has a
larger Percentage best and smaller Percentage worst than
Olejnik's method. Further, Method B provides virtually unbiased
estimates while Olejnik's method leads to biased estimates.
Hence when extending three time points beyond the pretests,
with data of reliability .9, Method B is the preferred
method, even though Olejnik's method has a smaller standard
error. When the reliability of the data is either .5 or .7
and one is extending three time points beyond the pretests,
the Percentages best and worst are almost identical for Method
B and for Olejnik's method. As mentioned earlier, Method B
leads to virtually unbiased estimates while Olejnik's method
leads to biased estimates. However, the standard errors
associated with Olejnik's method are smaller. So, when the
reliability of the data is either .5 or .7 and one is extending
three time points beyond the pretests, data analysts must
decide whether they want unbiasedness, in which case Method
B should be chosen, or smaller standard errors, in which case
Olejnik's method should be chosen. When extending four or more
time points beyond the pretests Method B becomes the recommended
method for all three reliability levels. The reason for this
recommendation is that Method B remains virtually unbiased
while the bias inherent in Olejnik's method becomes larger as
the data are extended more and more time points beyond the
pretests. Further, the Percentages best are larger and the
Percentages worst are smaller for Method B than for Olejnik's
method.
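The recommendations stated in this section for linear and exponential growth can be collected into a small lookup. The function below is only a restatement of those recommendations in code, with a hypothetical name and return strings; it covers only the conditions actually simulated (linear or exponential growth, reliabilities of .5, .7, and .9) and raises an error elsewhere rather than extrapolating.

```python
def recommend_method(growth, steps_beyond, reliability=None):
    """Return the method recommended in the text for a studied condition.

    growth       : "linear" or "exponential"
    steps_beyond : number of time points beyond the last pretest (>= 1)
    reliability  : needed only for exponential growth with steps_beyond == 3
    """
    if steps_beyond < 1:
        raise ValueError("steps_beyond must be at least 1")
    if growth == "linear":
        # One step: Olejnik is better; two steps: toss-up, Olejnik is simpler.
        return "Olejnik" if steps_beyond <= 2 else "Method A"
    if growth == "exponential":
        if steps_beyond <= 2:
            return "Olejnik"
        if steps_beyond >= 4:
            return "Method B"
        if reliability == 0.9:
            return "Method B"
        if reliability in (0.5, 0.7):
            return "Method B (unbiasedness) or Olejnik (smaller standard error)"
        raise ValueError("reliability value was not studied")
    raise ValueError("growth form was not studied")
```

Because each growth form was simulated with a single set of parameters, the lookup should be read as a summary of this study's findings and not as a general rule.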

Limitations and Directions for Further Research 

This simulation study was carried out using only two forms
of natural growth models: a linear growth model and an exponential
growth model. Further, for each of the two models only one set
of parameters was used to generate the data. Also, the assumption
of equal reliability across time was made. Hence, the
results reported in this paper are very limited. On a positive
note, however, the results of this simulation study show that
Blumberg and Porter's methods and Olejnik's method are viable
data analysis methods. This is important because these methods
are not well known and hence have rarely been used in actual
data analysis situations. Because of the limitations just
cited, much further research needs to be done. First, for
the functional forms studied here, parameter values other than
those used in this study should be investigated. Second,
functional forms other than linear or exponential should be
studied. Third, constraints on the errors of measurement
other than that of equal reliability should be included in
future simulation studies. Finally, this study only concerned
the point estimation of treatment effects. Both Blumberg and
Porter (Blumberg, 1982a; Blumberg, 1982b; Blumberg & Porter,
1982) and Olejnik (1977) have developed interval estimation
and hypothesis testing procedures based on their point estimation
procedures. The utility of their interval estimation
and hypothesis testing procedures still needs to be studied.
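
A future simulation of the kind suggested above could treat the growth function and the reliability as interchangeable inputs. The Python sketch below is a hypothetical illustration, not the data-generating program used in this study: the parameter distributions, the function names, and the choice of independent normal errors are all assumptions. The only substantive relation used is the classical definition of reliability as true-score variance divided by the sum of true-score and error variance, applied separately at each time point so that reliability is equal across time.

```python
import math
import random
import statistics


def error_sd_for_reliability(true_sd, reliability):
    """Error SD implied by reliability = true var / (true var + error var)."""
    if not 0.0 < reliability <= 1.0:
        raise ValueError("reliability must be in (0, 1]")
    return true_sd * math.sqrt((1.0 - reliability) / reliability)


def linear_growth(t, intercept, rate):
    return intercept + rate * t


def exponential_growth(t, intercept, rate):
    return intercept * math.exp(rate * t)


def simulate_group(growth, times, n_subjects, reliability, rng):
    """Return one row of observed scores per subject, one column per time."""
    # Hypothetical parameter distributions, chosen only for illustration.
    params = [(rng.gauss(10.0, 2.0), rng.gauss(0.3, 0.05))
              for _ in range(n_subjects)]
    true_scores = [[growth(t, a, b) for t in times] for a, b in params]
    rows = [[] for _ in range(n_subjects)]
    for j in range(len(times)):
        column = [row[j] for row in true_scores]
        # Scale the error SD at each time point to hold reliability constant.
        err_sd = error_sd_for_reliability(statistics.pstdev(column), reliability)
        for i in range(n_subjects):
            rows[i].append(column[i] + rng.gauss(0.0, err_sd))
    return rows


rng = random.Random(0)  # fixed seed so the sketch is repeatable
rows = simulate_group(exponential_growth, times=[0, 1, 2, 3],
                      n_subjects=50, reliability=0.7, rng=rng)
```

Passing a different growth function or a list of time-specific reliabilities (with a small change to simulate_group) would correspond to the second and third research directions listed above.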






References



Blumberg, C. J. Methods for data analysis in nonequivalent control
group designs. American Statistical Association 1982 Proceedings
of the Social Statistics Section, pp. 197-202, 1982a.

Blumberg, C. J. The estimation and hypothesis testing of treatment
effects in nonequivalent control group designs when continuous
growth models are assumed. Unpublished Ph.D. dissertation,
Michigan State University, 1982b.

Blumberg, C. J., & Porter, A. C. The estimation and hypothesis
testing of treatment effects in nonequivalent control group
designs when continuous growth models are assumed. American
Educational Research Association annual meeting, March 1982
(ERIC Document ED 220 516/TM 820 533).

Bryk, A. S., Strenio, J. F., & Weisberg, H. I. A method for
estimating treatment effects when individuals are growing.
Journal of Educational Statistics, 1980, 5, 5-34.

Cook, T. D., & Campbell, D. T. Quasi-experimentation: Design and
analysis issues for field settings. Chicago: Rand McNally,
1979.

Joreskog, K. G., & Sorbom, D. LISREL IV: Analysis of linear
structural relationships by the method of maximum likelihood:
User's guide. International Educational Services, 1978.

Olejnik, S. Data analysis strategies for quasi-experimental
studies where differential group and individual growth rates
are assumed. Unpublished Ph.D. dissertation, Michigan State, 1977.

Strenio, J. F., Weisberg, H. I., & Bryk, A. S. Empirical Bayes
estimation of individual growth curve parameters and their
relationship to covariates. Biometrics, in press.



Appendix

LISREL IV Input Stream 



Title Card
DA NG=1 NI=p NO=sample size MA=CM
LABELS
'YT2' 'YT3' 'YT4' ... 'YTP' 'YT1'
RA
*
Data is next using the following form

   Y_1(t_2)   Y_1(t_3)   ...   Y_1(t_p)   Y_1(t_1)
   Y_2(t_2)   Y_2(t_3)   ...   Y_2(t_p)   Y_2(t_1)
      .          .                .          .
      .          .                .          .
   Y_N(t_2)   Y_N(t_3)   ...   Y_N(t_p)   Y_N(t_1)

MO NY=p-1 NX=1 NE=p-1 NK=1 LY=ID LX=ID BE=ID C
GA=FU,FR PH=DI,FR PS=ZE TE=DI,FR TD=DI,FR
ST
Give starting values for GA(1,1) to GA(p-1,1), PH(1),
TE(1) to TE(p-1), and TD(1) making sure that they are
all positive.
OU MR FD SE ND=8
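
The LABELS card above lists the posttest measures first and the pretest measure (YT1) last, so each data record must be arranged in the order t2, t3, ..., tp, t1. The Python sketch below shows one way such records might be prepared. It assumes that each subject's scores are stored in time order t1, ..., tp; the function names and the fixed-width output layout are hypothetical and are not taken from the LISREL IV documentation.

```python
def reorder_for_lisrel(scores_in_time_order):
    """Move the pretest (t1) score to the end: t2, t3, ..., tp, t1."""
    if len(scores_in_time_order) < 2:
        raise ValueError("need at least two time points")
    return list(scores_in_time_order[1:]) + [scores_in_time_order[0]]


def format_record(scores, width=10, decimals=3):
    """Format one subject's scores as a single fixed-width line (assumed layout)."""
    return "".join(f"{value:{width}.{decimals}f}" for value in scores)


# One hypothetical subject measured at t1, t2, t3, t4.
subject = [12.0, 13.5, 15.25, 17.0]
reordered = reorder_for_lisrel(subject)
print(format_record(reordered))
```

With four time points the record is written as the scores at t2, t3, and t4 followed by the score at t1, matching NY=p-1 and NX=1 on the MO card.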




Table 1
Results for .5 Linear

Number of time
points beyond             Observed     Standard     Percentage   Percentage
last pretest     Method   bias         deviation    best         worst

      1          A        .074         0.937        35.5 %       21 %
                 B        .096         2.547        11.5         69
                 O        [illegible]  0.738        53           10

      2          A        [illegible]  [illegible]  [illegible]  [illegible]
                 B        .308         3.490        11.5         71.5
                 O        -.005        1.098        44           15.5

      3          A        [illegible]  [illegible]  43           13.5
                 B        .620         5.176        14.5         68.5
                 O        .108         1.543        42.5         18

      4          A        [illegible]  [illegible]  50.5         9
                 B        .546         6.452        10.5         76.5
                 O        -.053        2.015        39           14.5

      5          A        -.182        1.479        54.5         7.5
                 B        1.002        7.972        8            74
                 O        .094         2.293        37.5         18.5

      6          A        .027         1.677        56.5         7.5
                 B        1.034        8.653        6.5          77.5
                 O        .087         2.657        37           15




Table 2
Results for .7 Linear

Number of time
points beyond             Observed     Standard     Percentage   Percentage
last pretest     Method   bias         deviation    best         worst

      1          A        .049         0.613        37.5 %       15 %
                 B        .227         1.983        8.5          76
                 O        [illegible]  [illegible]  [illegible]  9

      2          A        .042         0.703        45.5         10
                 B        .223         2.740        9            77
                 O        [illegible]  [illegible]  45.5         [illegible]

      3          A        .086         0.798        48           9.5
                 B        .486         4.035        9            75.5
                 O        [illegible]  [illegible]  [illegible]  15

      4          A        .001         0.900        53.5         7.5
                 B        .455         5.122        7.5          81
                 O        -.034        1.319        39           11.5

      5          A        -.119        0.968        55.5         6
                 B        .728         6.180        6            81.5
                 O        .062         1.501        38.5         12.5

      6          A        .018         1.098        53.5         7.5
                 B        .887         6.773        10           80.5
                 O        .057         1.739        36.5         12



Table 3
Results for .9 Linear

Number of time
points beyond             Observed     Standard     Percentage   Percentage
last pretest     Method   bias         deviation    best         worst

      1          A        .025         [illegible]  37.5 %       [illegible]
                 B        .122         1.145        6.5          75.5
                 O        .005         0.246        56           8.5

      2          A        .021         0.358        46           8.5
                 B        .102         1.619        8.5          80.5
                 O        -.002        0.366        45.5         11

      3          A        .044         0.406        49           10
                 B        .263         2.338        6            79.5
                 O        -.036        0.514        45           10.5

      4          A        .000         0.458        55           [illegible]
                 B        .252         3.030        4.5          88
                 O        -.018        0.672        40.5         6.5

      5          A        -.061        0.493        54           4.5
                 B        .349         3.560        6            84
                 O        .031         0.764        40           11.5

      6          A        .009         0.559        56.5         5.5
                 B        .532         3.972        6            84
                 O        .029         0.886        37.5         10.5



Table 4
Results for .5 Exponential

Number of time
points beyond             Observed     Standard     Percentage   Percentage
last pretest     Method   bias         deviation    best         worst

      1          A        -11.562      55.541       3 %          86.5 %
                 B        -0.064       1.226        36.5         11
                 O        0.153        0.561        60.5         2.5

      2          A        -57.052*     388.770*     4            88
                 B        -0.170       1.487        38.5         7
                 O        0.455        0.855        57.5         5

      3          A        -72.352*     429.307*     3            89
                 B        -0.098       1.750        46.5         7.5
                 O        1.037        1.203        50.5         3.5

      4          A        -121.758*    467.494*     [illegible]  87
                 B        -0.255       1.991        57.5         [illegible]
                 O        [illegible]  1.582        38.5         8

      5          A        -178.686*    766.859*     4.5          86
                 B        -0.107       2.096        71.5         4
                 O        [illegible]  [illegible]  [illegible]  [illegible]

      6          A        -218.836*    956.396*     5            86
                 B        -0.214       3.092        75.5         3.5
                 O        4.225        2.162        19.5         10.5





Table 5
Results for .7 Exponential

Number of time
points beyond             Observed     Standard     Percentage   Percentage
last pretest     Method   bias         deviation    best         worst

      1          A        -4.993       27.927       3.5 %        85.5 %
                 B        -0.045       0.985        28           11.5
                 O        0.149        0.367        68.5         3

      2          A        [illegible]* [illegible]* [illegible]  85.5
                 B        -0.133       1.199        36           12
                 O        0.453        0.560        58.5         2.5

      3          A        -57.992*     664.887*     4            87.5
                 B        -0.062       1.402        48           8
                 O        1.009        0.788        48           4.5

      4          A        [illegible]* 366.526*     4            90.5
                 B        -0.166       1.569        61           2.5
                 O        [illegible]  [illegible]  [illegible]  7

      5          A        -74.629*     438.340*     5.5          88.5
                 B        -0.095       1.697        80           2
                 O        2.804        1.212        14.5         9.5

      6          A        -83.196*     407.534*     4.5          86.5
                 B        -0.093       2.418        85           1
                 O        4.194        1.423        10.5         12.5






Table 6
Results for .9 Exponential

Number of time
points beyond             Observed     Standard     Percentage   Percentage
last pretest     Method   bias         deviation    best         worst

      1          A        -3.011       21.806       5.5 %        79 %
                 B        -0.016       0.598        23           19
                 O        0.144        0.187        71.5         2

      2          A        -117.414*    811.526*     7            85
                 B        -0.070       0.721        42           9
                 O        0.983        0.403        51           6

      3          A        -3.878*      27.861*      8            83.5
                 B        -0.016       0.843        64.5         6
                 O        0.983        0.403        27.5         10.5

      4          A        -15.978*     123.460*     5.5          85
                 B        -0.066       0.939        86           1
                 O        1.704        0.532        8.5          14

      5          A        -62.224*     560.774*     5.5          85
                 B        -0.052       1.029        91.5         0.5
                 O        2.776        0.626        3            14.5

      6          A        -67.189*     409.017*     5            84.5
                 B        0.014        1.405        93.5         0
                 O        4.165        0.742        1.5          15.5
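
The four summary columns reported in Tables 1 through 6 can be illustrated with a short Python sketch. It is not the authors' program. The function name summarize, the toy estimates, the use of the sample standard deviation, and the rule that the best method in a replication is the one whose estimate is closest in absolute value to the true effect are all assumptions made for illustration; ties go to the first method listed.

```python
import statistics


def summarize(estimates, true_effect):
    """Summarize point estimates per method.

    estimates maps a method label to a list of estimates, one per replication.
    """
    methods = list(estimates)
    n_reps = len(estimates[methods[0]])
    best = dict.fromkeys(methods, 0)
    worst = dict.fromkeys(methods, 0)
    for r in range(n_reps):
        # Absolute error of each method in this replication.
        errors = {m: abs(estimates[m][r] - true_effect) for m in methods}
        best[min(errors, key=errors.get)] += 1
        worst[max(errors, key=errors.get)] += 1
    return {
        m: {
            "bias": statistics.mean(estimates[m]) - true_effect,
            "sd": statistics.stdev(estimates[m]),
            "pct_best": 100.0 * best[m] / n_reps,
            "pct_worst": 100.0 * worst[m] / n_reps,
        }
        for m in methods
    }


# Toy estimates for four replications; the values are invented.
toy = {
    "A": [1.0, 2.5, 1.8, 2.1],
    "B": [2.2, 1.9, 2.0, 3.0],
    "O": [2.6, 2.0, 1.0, 2.4],
}
result = summarize(toy, true_effect=2.0)
for method, row in result.items():
    print(method, row)
```

Because exactly one method is counted as best and one as worst in each replication, the percentages best and the percentages worst each sum to 100 across the three methods under this rule.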







